An Effective Bypass Mechanism to Enhance Branch Predictor for SMT Processors

Authors

  • Yongfeng Pan
  • Xiaoya Fan
  • Liqiang He
  • Deli Wang
Abstract

Unlike traditional superscalar processors, a simultaneous multithreaded (SMT) processor can exploit instruction-level parallelism and thread-level parallelism at the same time. With the same fetch width, an SMT processor fetches less deeply from any single thread than a traditional superscalar processor does. Meanwhile, instructions from all threads share the same functional units. Together, these characteristics make it possible to improve SMT performance by reducing branch mispredictions. Based on the observation that the directions of about 15% of branch instructions can be known with certainty at the prediction cycle, a simple and effective bypass mechanism is proposed. The scheme does not depend on any existing branch predictor and can be used as an enhancement to any of them. Execution-driven simulation results show that, compared with a simple baseline (gshare) predictor, the proposed predictor reduces the branch misprediction rate by more than 15% on average and improves instruction throughput by about 2.5%.
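
The abstract does not give the internals of the bypass mechanism, so the following is only a minimal sketch of the general idea under stated assumptions: a small gshare predictor (global history XORed with the branch address, indexing a table of 2-bit counters) is consulted only when the branch direction is not already known at prediction time. The names `GShare`, `predict_with_bypass`, and `known_direction` are hypothetical and are not taken from the paper. How the hardware determines that a direction is known, and whether bypassed branches still train the predictor, are not specified in the abstract and are left open here.

```python
class GShare:
    """Minimal gshare predictor: (PC XOR global history) indexes 2-bit counters."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        # Counters range 0..3; start at 1 (weakly not-taken).
        self.table = [1] * (1 << index_bits)
        self.history = 0

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        # Predict taken when the counter is in the upper half (2 or 3).
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        # Train the counter with the resolved outcome, then shift the history.
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def predict_with_bypass(predictor, pc, known_direction=None):
    """Return (direction, bypassed).

    Assumption: the caller passes known_direction=True/False when the
    branch outcome can already be determined at prediction time, and
    None otherwise. In the known case the predictor is not consulted.
    """
    if known_direction is not None:
        return known_direction, True
    return predictor.predict(pc), False
```

Because the bypass is a wrapper around `predict`, the underlying predictor could be swapped for any other scheme without changing the bypass logic, which matches the abstract's claim that the mechanism does not depend on a particular predictor.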


Similar resources

A latency-conscious SMT branch prediction architecture

Executing multiple threads has proved to be an effective solution to partially hide latencies that appear in a processor. When a thread is stalled because a long-latency operation is being processed, like a memory access or a floating-point calculation, the processor can switch to another context so that another thread can take advantage of the idle resources. However, fetch stall conditions cau...


Tolerating Branch Predictor Latency on SMT

Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads. If a thread is stalled, resources can be used by other threads. However, fetch stall conditions caused by multi-cycle branch predictors prevent SMT from achieving its full potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions to deal with...


Exploring branch target buffer access filtering for low-energy and high-performance microarchitectures

Powerful branch predictors along with a large branch target buffer (BTB) are employed in superscalar and simultaneous multi-threading (SMT) processors for instruction-level parallelism and thread-level parallelism exploitation. However, the large BTB not only dominates the predictor energy consumption, but also becomes a major roadblock in achieving faster clock frequencies at deep sub-micron t...


Building an SMT Application Simulator

It also requires examination of various processor and architecture design decisions. SMT processors may exhibit different cache, branch-prediction, and utilization patterns than conventional processors [10, 9]. While studies of several of these factors have been undertaken, there are many more variables to be examined; each component found on a conventional chip may behave differently when seve...


Evaluating Branch Predictors on an SMT Processor

Simultaneous multithreading (SMT) provides significant increases in microprocessor throughput by issuing instructions from multiple threads per clock cycle. SMT can be realized in a wide-issue superscalar with a modest increase in resources, because much of the hardware is shared among the multiple thread contexts. Branch prediction accuracy, a key component of microprocessor performance, can s...




Publication date: 2007